Towards a discourse relation-aware approach for Chinese-English machine translation
نویسنده
چکیده
Translation of discourse relations is one of the recent efforts of incorporating discourse information to statistical machine translation (SMT). While existing works focus on disambiguation of ambiguous discourse connectives, or transformation of discourse trees, only explicit discourse relations are tackled. A greater challenge exists in machine translation of Chinese, since implicit discourse relations are abundant and occur both inside and outside a sentence. This thesis proposal describes ongoing work on bilingual discourse annotation and plans towards incorporating discourse relation knowledge to a ChineseEnglish SMT system with consideration of implicit discourse relations. The final goal is a discourse-unit-based translation model unbounded by the traditional assumption of sentence-to-sentence translation.
منابع مشابه
Assessing the Discourse Factors that Influence the Quality of Machine Translation
We present a study of aspects of discourse structure — specifically discourse devices used to organize information in a sentence — that significantly impact the quality of machine translation. Our analysis is based on manual evaluations of translations of news from Chinese and Arabic to English. We find that there is a particularly strong mismatch in the notion of what constitutes a sentence in...
متن کاملThe Use of Second-Person Reference in Advertisement Translation with Reference to Translation between Chinese and English
This research aimed to review the use of second-person reference in advertisement translation, work out the general rules, and provide guidance to translators. Using second-person reference is common in the advertising discourse. Addressing audiences directly involves their attention and in this way enhances their memorization of the advertised message. Second-person reference can be realized v...
متن کاملMachine Translation with Many Manually Labeled Discourse Connectives
The paper presents machine translation experiments from English to Czech with a large amount of manually annotated discourse connectives. The gold-standard discourse relation annotation leads to better translation performance in ranges of 4–60% for some ambiguous English connectives and helps to find correct syntactical constructs in Czech for less ambiguous connectives. Automatic scoring confi...
متن کاملUsing Sense-labeled Discourse Connectives for Statistical Machine Translation
This article shows how the automatic disambiguation of discourse connectives can improve Statistical Machine Translation (SMT) from English to French. Connectives are firstly disambiguated in terms of the discourse relation they signal between segments. Several classifiers trained using syntactic and semantic features reach stateof-the-art performance, with F1 scores of 0.6 to 0.8 over thirteen...
متن کاملMachine Translation of Labeled Discourse Connectives
This paper shows how the disambiguation of discourse connectives can improve their automatic translation, while preserving the overall performance of statistical MT as measured by BLEU. State-of-the-art automatic classifiers for rhetorical relations are used prior to MT to label discourse connectives that signal those relations. These labels are used for MT in two ways: (1) by augmenting factor...
متن کامل